ABSTRACT

We study the functional errors-in-variables regression model. In
the case of no equation error (all randomness due to measurement
errors), the maximum likelihood estimator computed assuming
normality is asymptotically better than the usual moments estimator,
even if the errors are not normally distributed. For certain
statistical problems such as randomized two group analysis of
covariance, the least squares estimate is shown to be better than
the aforementioned errors-in-variables methods for estimating certain

1. Introduction


The problem of estimating linear regression parameters when the variables are
subject to measurement or observation error has a long history and has
recently been the focus of considerable attention. Reilly and Patino-Leal
(1981) list a number of recent publications concerning situations in which
the problem arises; see Wu, Ware and Feinleib (1980) for a simple but
particularly interesting example in a biomedical context. Blomquist (1977),
Nussbaum (1980), Fuller (1980) and Gleser (1981) have recently addressed
various theoretical aspects of the problem.

This paper has three purposes. First, by exploiting a particular
representation of estimators we unify and extend some of the asymptotic
results for the normal theory maximum likelihood estimator (normality-MLE)
and the "method of moments" estimators developed by Fuller (1980). Second,
having obtained the asymptotic distributions of the method of moments
estimators and the normality-MLE, we are in a position to compare the two via
limiting variances. In a particularly important special case, we are able to
show that the normality-MLE is better than the method of moments estimator in
the sense of having an asymptotic normal distribution centered about the true
regression parameter and with smaller asymptotic variance. This is perhaps
not too surprising at the normal model, but it in fact holds even if
assumptions of normality are violated. Our Monte-Carlo study confirms this
result, but we also discuss reasons why one would want to use the method of
moments estimator in practice, especially when using Fuller's small sample
modification.

The third major purpose of this paper is to study the least squares estimator
(LSE), computed as if the variables were observed exactly. The LSE is
generally inconsistent for regression parameters, and thus has not been
considered much in the literature. This is unfortunate because, as has not
been generally recognized, there are important statistical problems in which
the LSE is consistent; one example is two-group analysis of covariance for a
randomized study, where the LSE consistently estimates the treatment effect
difference even when there are errors in the variables. A heuristic
asymptotic result suggests that when the LSE is consistent for a particular
contrast, it will be better than the normality-MLE in large samples. The
conjecture is explicitly confirmed for two group analysis of covariance. Our
small Monte-Carlo study is illuminating here.

There are two other important features of the paper. First, to the
best of our knowledge the Monte-Carlo results are among the first of their
kind for the errors-in-variables problems we consider, although Wolter and
Fuller (1982 a,b) also report Monte-Carlo results. Second, the Monte-Carlo study
includes recently introduced generalizations of M-estimates (Carroll and
Gallo (1982)), which we show to work quite well.

2.  Models,  Assumptions  and  Estimates 

We consider a general errors-in-variables (EIV) regression model in which
some subset of the variables is subject to error, while some are observed
exactly; the response is replicated s times and the predictor variables
subject to measurement error are replicated r times. Specifically,

$$Y_i = X_1 \beta_1 + X_2 \beta_2 + \epsilon_i, \qquad \epsilon_i = \delta + V_i, \qquad i = 1, \ldots, s,$$

$$C_j = X_2 + U_j, \qquad j = 1, \ldots, r.$$

Here, $\beta_1$ is a $(p_1 \times 1)$ vector and $\beta_2$ is $(p_2 \times 1)$, $p = p_1 + p_2$. The vectors
$\delta$, $V_i$ and $Y_i$ are of dimension $(N \times 1)$, where N is the sample size in the
study. $X_1$ and $X_2$ are constant matrices of order $(N \times p_1)$ and $(N \times p_2)$,
respectively. $X_1$ is observable; however, because of measurement error $U_j$, $X_2$
is not observable but rather the $(N \times p_2)$ matrices $C_j$ are observed. The
$(N \times 1)$ random vector $\delta$ is called the equation error, while the $\{V_i\}$ are the
measurement errors in the response. The assumption that $X_1$ and $X_2$ are
constant puts us in the functional EIV model. In some cases we will assume
no equation error ($\delta = 0$), in which case we have the classical linear
functional relationship. The concept of equation error was introduced by
Fuller (1980).

We assume that the $\{V_i\}$ are mutually independent and independent of $\delta$, as are
the $\{U_j\}$. The elements of $\delta$ and those of each $V_i$ are i.i.d. with zero mean
and finite variances $\sigma_\delta^2$ and $\sigma_v^2$ respectively, while the rows of each $U_j$ are
i.i.d. with mean zero and non-singular covariance matrix $\Sigma_u$. We define

$$X = [X_1 \;\; X_2]$$

and assume that X is of full column rank such that

$$\Delta = \lim_{N \to \infty} N^{-1} X'X \ \text{is positive definite.} \qquad (2.1)$$

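We further define

$$\Delta = \begin{bmatrix} \Delta_{11} & \Delta_{12} \\ \Delta_{21} & \Delta_{22} \end{bmatrix}, \qquad \Sigma_{u*} = \begin{bmatrix} 0 & 0 \\ 0 & \Sigma_u \end{bmatrix},$$

where $\Delta_{11}$ and the upper left-hand submatrix of zeroes in $\Sigma_{u*}$ are $(p_1 \times p_1)$.

We will discuss a number of special cases of our EIV model and define an
estimator of $\beta$ in each.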

Case No. 1 (No replication) Gleser (1981) and the majority of researchers
assume no replication is available (r = s = 1). Here, we suppress the
subscripts referring to the replicates and write Y, $\epsilon$, C, etc. We assume
that the rows of $[U \;\; \epsilon]$ have finite fourth moments and that a matrix $\Sigma_0$ is
known such that
we define $\Sigma_{0*}$, $\Sigma_{u0*}$, $\Sigma_{\epsilon u0*}$ correspondingly.

With $I_N$ the identity matrix of order N, write

$$R = I_N - X_1 (X_1' X_1)^{-1} X_1',$$

$$W = [C \;\; Y]' \, R \, [C \;\; Y].$$

Let $\hat\theta$ be the smallest eigenvalue of $\Sigma_0^{-1} W$. If $C_*' C_* - \hat\theta \, \Sigma_{u0*}$ is non-singular
(Gallo (1982b) has shown that this holds a.s. if the error distribution is
absolutely continuous), we define

$$\hat\beta_M = (C_*' C_* - \hat\theta \, \Sigma_{u0*})^{-1} (C_*' Y - \hat\theta \, \Sigma_{\epsilon u0*}). \qquad (2.3)$$

This estimate is the maximum likelihood estimate for jointly normally
distributed errors (note: if we omit assumption (2.2), the supremum of the
likelihood is infinite). The estimate was derived in a more general
framework by Healy (1975) and was shown by Gleser (1981) to be equivalent to
a generalized weighted least squares estimate. We emphasize that we will
study (2.3) and the other estimates of $\beta$ without assuming normality.
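The computation behind (2.3) can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes that the starred matrix is $[X_1 \;\; C]$, that the starred covariance blocks are the corresponding blocks of $\Sigma_0$ padded with zeros for the error-free columns, and that $\Sigma_0$ orders the rows of $[U \;\; \epsilon]$ with the response error last. The function name `normality_mle` and the simulated data are hypothetical.

```python
import numpy as np

def normality_mle(X1, C, Y, Sigma0):
    """Sketch of estimator (2.3) for the unreplicated case (r = s = 1).

    X1     : (N, p1) exactly observed predictors
    C      : (N, p2) error-prone predictors
    Y      : (N,)    response
    Sigma0 : (p2+1, p2+1) assumed-known matrix for the rows of [U e]
    Returns the estimate of (beta_1, beta_2) and the eigenvalue theta.
    """
    p1, p2 = X1.shape[1], C.shape[1]
    CY = np.column_stack([C, Y])
    # Apply R = I - X1 (X1'X1)^{-1} X1' without forming the N x N matrix.
    R_CY = CY - X1 @ np.linalg.solve(X1.T @ X1, X1.T @ CY)
    W = CY.T @ R_CY
    # Smallest eigenvalue of Sigma0^{-1} W (real because W is symmetric
    # and Sigma0 is positive definite).
    theta = np.min(np.linalg.eigvals(np.linalg.solve(Sigma0, W)).real)
    # Zero-pad the covariance blocks for the error-free columns X1.
    Sigma_u_star = np.zeros((p1 + p2, p1 + p2))
    Sigma_u_star[p1:, p1:] = Sigma0[:p2, :p2]
    Sigma_eu_star = np.zeros(p1 + p2)
    Sigma_eu_star[p1:] = Sigma0[:p2, p2]
    C_star = np.column_stack([X1, C])
    lhs = C_star.T @ C_star - theta * Sigma_u_star
    rhs = C_star.T @ Y - theta * Sigma_eu_star
    return np.linalg.solve(lhs, rhs), theta

# Hypothetical data: intercept 10, slope -4, equal error variances.
rng = np.random.default_rng(0)
N = 2000
X1 = np.ones((N, 1))
X2 = 2.0 * rng.normal(size=(N, 1))
beta = np.array([10.0, -4.0])
Y_exact = X1 @ beta[:1] + X2 @ beta[1:]
C_obs = X2 + 0.5 * rng.normal(size=(N, 1))
Y_obs = Y_exact + 0.5 * rng.normal(size=N)
estimate, theta = normality_mle(X1, C_obs, Y_obs, np.eye(2))
print(estimate, theta)
```

With error-free data the eigenvalue is numerically zero and the sketch reduces to ordinary least squares, which is a convenient sanity check.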


Case No. 2 (Equal replication) Here we let s = r > 1. The equal
replication is convenient since it admits simpler notation, but it is by no
means necessary. It does arise in practical circumstances. For example, if
one predictor is baseline diastolic blood pressure and the response is change in
diastolic blood pressure, as in the Framingham Heart Study of the National
Heart, Lung and Blood Institute and in other studies, a common practice is to
take one replicate, i.e., r = s = 2. A method of moments estimator motivated
by the work of Fuller (1980) is

$$\hat\beta_R = \Big( \sum_{j=1}^{r} \sum_{\substack{k=1 \\ k \ne j}}^{r} C_{j*}' C_{k*} \Big)^{-1} \sum_{j=1}^{r} \sum_{\substack{k=1 \\ k \ne j}}^{r} C_{j*}' Y_k. \qquad (2.4)$$

We assume that the random matrices $\delta$, $[\epsilon_1 \;\; U_1]$, ..., $[\epsilon_r \;\; U_r]$ are mutually
independent and that the other specifications of Case No. 1 hold. (The
normality-MLE has not been calculated for a case such as this in which $\delta \ne 0$.)
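A short sketch may help make (2.4) concrete. It assumes the starred matrices are $[X_1 \;\; C_j]$ and pairs each replicate of the predictors only with responses from a different replicate, so that measurement errors in the two factors are independent. The function name `moments_estimator` and the simulated data are hypothetical, and Fuller's small sample modification is not included.

```python
import numpy as np

def moments_estimator(X1, C_reps, Y_reps):
    """Sketch of the cross-replicate moments estimator (2.4).

    X1     : (N, p1) exactly observed predictors
    C_reps : list of r arrays of shape (N, p2), replicated error-prone predictors
    Y_reps : list of r arrays of shape (N,), replicated responses
    """
    r = len(C_reps)
    C_star = [np.column_stack([X1, C]) for C in C_reps]
    p = C_star[0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for j in range(r):
        for k in range(r):
            if j != k:  # only cross-replicate products enter the sums
                A += C_star[j].T @ C_star[k]
                b += C_star[j].T @ Y_reps[k]
    return np.linalg.solve(A, b)

# Hypothetical data with r = s = 2, intercept 10 and slope -4.
rng = np.random.default_rng(1)
N = 4000
X1 = np.ones((N, 1))
X2 = 2.0 * rng.normal(size=(N, 1))
beta = np.array([10.0, -4.0])
Y_exact = X1 @ beta[:1] + X2 @ beta[1:]
C_reps = [X2 + rng.normal(size=(N, 1)) for _ in range(2)]
Y_reps = [Y_exact + rng.normal(size=N) for _ in range(2)]
beta_R = moments_estimator(X1, C_reps, Y_reps)

# Naive least squares on the first replicate, for comparison.
design = np.column_stack([X1, C_reps[0]])
beta_naive = np.linalg.lstsq(design, Y_reps[0], rcond=None)[0]
print(beta_R, beta_naive)
```

In this setup the naive slope is attenuated toward zero, while the cross-replicate estimator is not, because products of independent measurement errors average out.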

Case No. 3 (Equal replication, no equation error) This is the same situation
as in Case No. 2 except that $\delta = 0$, i.e., apart from measurement error the
underlying relation is exactly linear.

Let $\hat\theta_R$ be the smallest eigenvalue of $T_1^{-1} T_2$, where

$$T_1 = \sum_{i=1}^{r} \sum_{j=1}^{r} [C_i \;\; Y_i]' \, (\delta_{ij} - r^{-1}) \, [C_j \;\; Y_j],$$

$$T_2 = T_1 + r^{-1} \Big( \sum_{i=1}^{r} [C_i \;\; Y_i] \Big)' R \, \Big( \sum_{i=1}^{r} [C_i \;\; Y_i] \Big)$$

($\delta_{ij}$ is the Kronecker delta, the indicator of i = j). Then with

$$m_{jk} = r^{-1} \hat\theta_R - (\hat\theta_R - 1) \, \delta_{jk}$$

we define

$$\hat\beta_{MR} = \Big( \sum_{j=1}^{r} \sum_{k=1}^{r} m_{jk} \, C_{j*}' C_{k*} \Big)^{-1} \sum_{j=1}^{r} \sum_{k=1}^{r} m_{jk} \, C_{j*}' Y_k. \qquad (2.5)$$

This is the normality-MLE in the replication case, and has been derived by
Anderson (1951) and Healy (1980). Note that assumption (2.2) is unnecessary
here.

The estimates in all cases above have been shown to be consistent for $\beta$ as
$N \to \infty$; conditions on X weaker than (2.1) were obtained by Gallo (1982b).

In Case No. 2, $\hat\beta_R$ has been shown by Fuller (1980) to have a limiting normal
distribution when U and $\epsilon$ are normally distributed; under non-normality,
Fuller (1975) has some related results, although our proofs are different.

The MLE in Case No. 1 was demonstrated by Gleser (1981) to be asymptotically
normal, but the proof contains a slight error. (In particular, Gleser's Lemma
4.1 is contradicted by the following example: let $\{y_k\}$ be a sequence of
independent random variables assuming values $-2^{k/2}$, $0$, $2^{k/2}$ with probabilities
$2^{-(k+1)}$, $1 - 2^{-k}$, $2^{-(k+1)}$, respectively. According to the lemma, $N^{-1/2} \sum_{k=1}^{N} y_k$
is asymptotically standard normal, yet it can be shown that this quantity is
$o_p(1)$.) A simple remedy would require that the errors have finite moments
of order greater than four, an assumption we would like to avoid if possible.
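The counterexample is easy to examine numerically. Each term has mean zero and unit variance, yet it is nonzero with probability only $2^{-k}$, so almost surely only finitely many terms are nonzero and the normalized sum collapses to zero. The sketch below is illustrative only; the function name `scaled_sum`, the sample size and the seed are hypothetical choices, and the middle probability $1 - 2^{-k}$ is the value forced by the probabilities summing to one.

```python
import numpy as np

def scaled_sum(n, rng):
    """Return N^{-1/2} * sum_{k=1}^{n} y_k for the three-point sequence."""
    k = np.arange(1, n + 1)
    # y_k is nonzero with total probability 2^{-k}
    # (two tails of mass 2^{-(k+1)} each).
    nonzero = rng.random(n) < np.exp2(-k.astype(float))
    signs = rng.choice([-1.0, 1.0], size=n)
    # Evaluate 2^{k/2} only at the nonzero positions to avoid overflow.
    total = np.sum(signs[nonzero] * np.exp2(k[nonzero] / 2.0))
    return total / np.sqrt(n)

rng = np.random.default_rng(1982)
draws = np.array([scaled_sum(10_000, rng) for _ in range(200)])
# For a standard normal limit, only about 8% of draws would fall in (-0.1, 0.1).
frac_small = np.mean(np.abs(draws) < 0.1)
print(frac_small)
```

A standard normal variable lies within 0.1 of zero roughly 8% of the time, so a fraction near one is strong informal evidence against the normal limit.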

Finally, there are practical problems where it is known in advance that
$\Sigma_{\epsilon u} = 0$. In this instance, the estimators (2.3) and (2.5) can be altered to
a form in which they are more efficient. Our main qualitative comparisons
and conclusions (Sections 4-5) are unaffected.

3.  Asymptotic  Normality 

In  this  section  we  state  the  form  of  the  asymptotic  distributions  of  the 
estimators.  The  proofs  are  technical  and  are  delayed  until  Section  6. 

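Theorem 1

(No. 1) In Case No. 1, $N^{1/2} (\hat\beta_M - \beta)$ is asymptotically normally distributed
with mean zero. If the third and fourth moments of the joint distribution
of the rows of U and $\epsilon$ are the same as those of the normal distribution, then
the asymptotic covariance matrix of $N^{1/2} (\hat\beta_M - \beta)$ is

$$\mathrm{Cov}(\hat\beta_M) = d \, \Big\{ \Delta^{-1} + \Delta^{-1} \begin{bmatrix} 0 & 0 \\ 0 & Q \end{bmatrix} \Delta^{-1} \Big\}, \qquad (3.1)$$

where

$$d = [\beta_2' \;\; -1] \, \Sigma \, [\beta_2' \;\; -1]',$$

$$Q = \big( [I_{p_2} \;\; \beta_2] \, \Sigma^{-1} \, [I_{p_2} \;\; \beta_2]' \big)^{-1}.$$

(No. 2) In Case No. 2, $N^{1/2} (\hat\beta_R - \beta)$ is asymptotically normally
distributed with mean zero and covariance matrix

$$\mathrm{Cov}(\hat\beta_R) = \Delta^{-1} \Big( r^{-1} (d - \sigma_\delta^2) \Delta + \sigma_\delta^2 (\Delta + r^{-1} \Sigma_{u*}) + r^{-1} (r-1)^{-1} \big( (d - \sigma_\delta^2) \Sigma_{u*} + D_R \big) \Big) \Delta^{-1}, \qquad (3.2)$$

where

$$D_R = (\Sigma_{u*} \beta - \Sigma_{\epsilon u*}) (\Sigma_{u*} \beta - \Sigma_{\epsilon u*})'.$$

(No. 3) In Case No. 3, $N^{1/2} (\hat\beta_{MR} - \beta)$ is asymptotically normally
distributed with mean zero and covariance matrix

$$\mathrm{Cov}(\hat\beta_{MR}) = (d \, r^{-1}) \Big\{ \Delta^{-1} + (r-1)^{-1} \Delta^{-1} \begin{bmatrix} 0 & 0 \\ 0 & Q \end{bmatrix} \Delta^{-1} \Big\}. \qquad (3.3)$$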

Again, note that although two of our three estimates are normality-MLE's, we
do not assume in any part of Theorem 1 that the errors are normally
distributed. The assumption made in part (No. 1) of the theorem that the
error distribution moments are those of the normal distribution is not
necessary for the asymptotic normality of $\hat\beta_M$; nevertheless, the limit variance
depends on the third and fourth error moments and is in general quite
unwieldy. In stating the theorem we thus assume that the moments are those of
the normal distribution (as did Gleser (1981)) since this yields a concise
expression more easily compared with those of other estimates. We have made
no further assumptions on the errors beyond those of Section 2; in particular,
in Cases No. 2 and 3 we require only two finite moments.

4. Comparisons for the Linear Functional Relationship

We consider in this section Case No. 3, the linear functional relationship
with no equation error. In this case the asymptotic covariances (3.2) and
(3.3) are comparable.

Theorem 2 For Case No. 3 (even under non-normal distributions), the
normality-MLE (2.5) is asymptotically no worse than the moment estimator
(2.4), i.e.,

$$\mathrm{Cov}(\hat\beta_R) - \mathrm{Cov}(\hat\beta_{MR})$$

is positive semi-definite.


The proof is given in Section 7. However, there is an important special case
where the result is obvious, when $\Sigma = \sigma^2 I_{p_2+1}$. Then

$$\mathrm{Cov}(\hat\beta_R) - \mathrm{Cov}(\hat\beta_{MR})$$

Of course, what is most interesting about Theorem 2 is that the normality-MLE
is the (asymptotic) winner over method of moments even at non-normal
distributions. To get some idea of whether this result holds in small
samples, we performed the following Monte-Carlo study.
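Theorem 2 can also be checked numerically before turning to simulation. The sketch below evaluates the two limiting covariances for randomly drawn positive definite matrices and confirms that their difference has no negative eigenvalue. It rests on assumptions about the typeset formulas: (3.2) is taken with $\sigma_\delta^2 = 0$, (3.3) is read with the factor $d\,r^{-1}$ multiplying both terms, and $\Sigma$ is ordered with the response error last. All function names are hypothetical.

```python
import numpy as np

def _pieces(Delta, Sigma, beta, p1):
    """Shared quantities: p2, beta_2, d, and zero-padded covariance blocks."""
    p = Delta.shape[0]
    p2 = p - p1
    beta2 = beta[p1:]
    v = np.append(beta2, -1.0)
    d = v @ Sigma @ v
    Su_star = np.zeros((p, p))
    Su_star[p1:, p1:] = Sigma[:p2, :p2]
    Seu_star = np.zeros(p)
    Seu_star[p1:] = Sigma[:p2, p2]
    return p2, beta2, d, Su_star, Seu_star

def cov_moments(Delta, Sigma, beta, p1, r):
    """Assumed form of (3.2) with no equation error."""
    _, _, d, Su_star, Seu_star = _pieces(Delta, Sigma, beta, p1)
    w = Su_star @ beta - Seu_star
    D_R = np.outer(w, w)
    Dinv = np.linalg.inv(Delta)
    return Dinv @ (d / r * Delta + (d * Su_star + D_R) / (r * (r - 1))) @ Dinv

def cov_mle(Delta, Sigma, beta, p1, r):
    """Assumed form of (3.3)."""
    p2, beta2, d, _, _ = _pieces(Delta, Sigma, beta, p1)
    B = np.hstack([np.eye(p2), beta2[:, None]])      # [I_{p2}  beta_2]
    Q = np.linalg.inv(B @ np.linalg.inv(Sigma) @ B.T)
    Q_pad = np.zeros_like(Delta)
    Q_pad[p1:, p1:] = Q
    Dinv = np.linalg.inv(Delta)
    return d / r * (Dinv + Dinv @ Q_pad @ Dinv / (r - 1))

def random_spd(k, rng):
    """Random symmetric positive definite matrix (illustrative only)."""
    A = rng.normal(size=(k, k))
    return A @ A.T + k * np.eye(k)

def min_eig_of_difference(seed, p1=1, p2=2, r=2):
    rng = np.random.default_rng(seed)
    Delta = random_spd(p1 + p2, rng)
    Sigma = random_spd(p2 + 1, rng)
    beta = rng.normal(size=p1 + p2)
    diff = cov_moments(Delta, Sigma, beta, p1, r) - cov_mle(Delta, Sigma, beta, p1, r)
    return np.linalg.eigvalsh((diff + diff.T) / 2).min()

print([min_eig_of_difference(s) for s in range(5)])
```

The smallest eigenvalue is zero up to rounding, since the two covariances agree in the directions associated with the error-free block.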


All calculations were done at the NIH computing center. Random numbers were
generated using the IMSL routines GGNPM and GGUBS. There were 500 Monte-Carlo
replications. The true model was simple linear regression following the
format of Section 2 with r=s=2 replications. The intercept was 10 and the
slope was -4. In the notation of Section 2, $X_1$ is a column vector of N=40
ones, $\beta_1 = 10$, $\beta_2 = -4$ and $X_2$ is a column vector obtained as the values of X2
from Table 1 of Jobson and Fuller (1980).


Although the estimates were calculated in the forms which do not assume
$\Sigma_{\epsilon u} = 0$, we fixed $\Sigma_{\epsilon u} = 0$ in the Monte-Carlo study. The rows of
the error terms $\delta$, $V_i$ and $U_j$ were thus generated independently; all three were
either normally distributed or had a contaminated normal distribution. In
general, any random variable was either $N(0, \sigma^2)$ (Normal) or $N(0, \sigma^2)$ with
probability 0.9 and $N(0, c^2 \sigma^2)$ with probability 0.1 (Contaminated Normal),
with

Equation Error ($\delta$):        $\sigma_\delta^2$ = 0 or 4,     c=5
Y Measurement Error (V):   $\sigma_v^2$ = 4,          c=5
X Measurement Error (U):   $\sigma_u^2$ = 1, 2 or 4,  c=3

The normality-MLE is calculated assuming $\sigma_\delta^2 = 0$, so the results for
$\sigma_\delta^2 = 4$ give some idea of its robustness against this assumption.
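The contaminated normal errors are simple to generate as a scale mixture. The sketch below is an illustration with modern tools, not the IMSL routines used in the study; the function name `contaminated_normal` is hypothetical. Note that $\sigma^2$ is the variance of the uncontaminated component, so the overall variance is $\sigma^2 (0.9 + 0.1 c^2)$, that is $1.8 \sigma^2$ for c=3 and $3.4 \sigma^2$ for c=5.

```python
import numpy as np

def contaminated_normal(size, sigma2, c, rng, eps=0.1):
    """Draw from N(0, sigma2) w.p. 1-eps and from N(0, c^2 sigma2) w.p. eps."""
    # Per-draw standard deviation: inflated by the factor c for contaminated draws.
    scale = np.where(rng.random(size) < eps, c, 1.0) * np.sqrt(sigma2)
    return rng.normal(size=size) * scale

rng = np.random.default_rng(40)
u = contaminated_normal(400_000, sigma2=2.0, c=3.0, rng=rng)
theoretical_var = 2.0 * (0.9 + 0.1 * 3.0**2)   # sigma2 * (1 - eps + eps * c^2)
print(u.var(), theoretical_var)
```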

The normality-MLE was computed by (2.5). The moments estimator was
(2.4), with the exception that the term in (2.4)

$$\sum_{j=1}^{r} \sum_{\substack{k=1 \\ k \ne j}}^{r} C_{j*}' C_{k*}$$

was replaced by Fuller's modification (Fuller (1980), pages 414-415) using
Fuller's $\alpha = 1$. This modification is crucial to get the best performance of the
moments estimator. Although this modification occurs with negligible
probability asymptotically, we found in our study that it occurred often.


Finally, we studied a generalization of the moments estimator (2.4) introduced
by Carroll and Gallo (1982) and designed to be robust against departures from
normality in $\delta$ and V. If Fuller's modification was necessary, we used his
estimator. Otherwise, we solved

$$0 = \sum_{i=1}^{N} \Big\{ Z_{i2} \, \psi\Big( \frac{Y_{i1} - \beta_1 - \beta_2 Z_{i1}}{\hat\sigma} \Big) + Z_{i1} \, \psi\Big( \frac{Y_{i2} - \beta_1 - \beta_2 Z_{i2}}{\hat\sigma} \Big) \Big\}, \qquad (4.1)$$


where $\{Z_{ij}\}$ are the individual elements of $C_j$ and $\hat\sigma$ is the median absolute
residual from the moments fit divided by .6745. It is easy to solve (4.1) by
iteratively reweighted least squares, although some care must be taken. To
date there is no estimator with distributional robustness properties
which includes as a special case the normality-MLE, although such an estimator
will surely be discovered soon. For Case No. 1, Brown (1982) has suggested
such a class, but his proof of consistency is in error and his estimator is
not consistent and asymptotically normal in general.

Mean square error (MSE) efficiencies relative to the normality-MLE are given
in Table 1, along with the percentage of times Fuller's modification was used.
Efficiencies are also given in Table 2 for the 95th percentile of the absolute
errors for the different estimators. The first twelve lines of each table are
for the situation of no equation error ($\delta \equiv 0$, $\sigma_\delta^2 = 0$), assumed in calculating
the normality-MLE. Note, as suggested by Theorem 2, that the normality-MLE
generally dominates the moments estimator (but not vastly so), even at
non-normal distributions. The Carroll-Gallo estimator is generally the best, even
when compared to the normality-MLE and even though it is meant to improve the
moments estimator, not the normality-MLE. The Carroll-Gallo estimator does
lose some efficiency when the measurement error in X becomes very large; this
is a reflection of the fact that the "asymptotically negligible" Fuller
modification is needed 30%-50% of the time. Clearly, it would be helpful to
have a distribution-robust generalization of the normality-MLE. Further work
should also focus on bounded influence (Carroll and Gallo (1982) make one
simple suggestion along these lines).

The last twelve lines of Tables 1 and 2 reflect the situation $\sigma_\delta^2 = 4$, i.e.,
there is substantial equation error. Here the normality-MLE calculated
assuming $\sigma_\delta^2 = 0$ does particularly poorly. Clearly, the normality-MLE is not
robust against violations of the linear functional relationship (no equation
error). Certainly, the Monte-Carlo study suggests the need for calculation and
study of the normality-MLE in the general Case No. 2. In the absence of such
a general estimator, we would in practice favor the moments estimator (2.4)
with Fuller's modification over the estimator (2.5), especially for small
samples or if substantial equation error is a possibility. Wolter and Fuller
(1982 a,b) also present Monte-Carlo results which emphasize that the moments
estimator can be superior in practice.

We consider the results for the Carroll-Gallo estimator to be very
encouraging, but further development is clearly needed. On an interim basis,
our estimators should be considered a supplement to and not a replacement for
Fuller's modified moments estimator.


5. Comparisons with the Least Squares Estimator

It is well-known that the ordinary least squares estimate computed as if the
observed values $C_*$ were the exact values of interest,

$$\hat\beta_L = (C_*' C_*)^{-1} C_*' Y,$$

is inconsistent for $\beta$ in EIV models. There are, however, situations in which
least squares can consistently estimate particular parameters of interest.

One such situation is two-class analysis of covariance. (DeGracie and Fuller
(1972) considered this situation in an EIV context.) The most important
parameter is often the treatment difference; it turns out that this is
consistently estimated by least squares as long as subjects are assigned to
treatments in such a way that the difference between the treatment means of
the covariate approaches zero, as would occur in a randomized or matched
study. More generally, Gallo (1982a) has shown


Theorem 3 Let $\gamma' = [\gamma_1' \;\; \gamma_2']$, where $\gamma_1$ is a $(p_1 \times 1)$ vector and $\gamma_2$ is a
$(p_2 \times 1)$ vector. Then

$$\gamma' \hat\beta_L \xrightarrow{p} \gamma' \beta \ \text{ for all } \beta, \Sigma \quad \text{iff} \quad \gamma_2' = \gamma_1' \Delta_{11}^{-1} \Delta_{12}. \qquad (5.1)$$

The advantage of such consistent contrasts is that we can consistently
estimate important parameters without needing replication or the often
artificial assumption (2.2). The LSE merits consideration, however, only if it is as good
an estimator as the EIV methods such as (2.3) - (2.5).

For example, first consider Case No. 1. General comparison of $\hat\beta_L$ and $\hat\beta_M$ turns
out to be difficult in our model, because the limit distribution of $\hat\beta_L$ is not
easily calculated. The following heuristic calculations are informative.
Suppose

$$N^{1/2} \big( \gamma_1' (X_1' X_1)^{-1} X_1' X_2 - \gamma_2' \big) \to 0 \qquad (5.2)$$

(this implies (5.1)). Then

Theorem 4 Under (5.2), $N^{1/2} \gamma' (\hat\beta_L - \beta)$ is asymptotically normal with mean
zero and variance $\mathrm{Cov}(\hat\beta_L, \gamma)$. Further,

$$\mathrm{Cov}(\hat\beta_L, \gamma) \le \gamma' \, \mathrm{Cov}(\hat\beta_M) \, \gamma, \qquad (5.3)$$

i.e., the LSE has no larger asymptotic variance than the normality-MLE.
Equality holds if and only if $\Sigma_{\epsilon u} = \Sigma_u \beta_2$.

The proof of Theorem 4 is in Section 7. Since (5.2) cannot be guaranteed, the
relevance of Theorem 4 is heuristic. For example, in the ANOCOVA example
mentioned previously, (5.2) implies that the true mean covariable difference
is $o(N^{-1/2})$, which is not assured even if the observed means are set equal
for all N. The Monte-Carlo study reported later supports Theorem 4, but other
theoretical calculations are also possible.

Consider balanced two-class analysis of covariance such as might occur in a
randomized study. In the notation of Section 2,

$$X_1' = \begin{bmatrix} 1 & 1 & \cdots & 1 & 1 \\ 1 & -1 & \cdots & 1 & -1 \end{bmatrix}, \qquad X_2' = [z_1 \;\; z_2 \;\; \cdots \;\; z_N], \qquad \beta_1' = (\mu \;\; \alpha). \qquad (5.3)$$

The parameter of interest is $\alpha = (0 \;\; 1 \;\; 0) \beta$. If the design is actually
randomized, the $\{Z_i\}$ would be i.i.d. with mean $\mu_z$ and variance $\sigma_z^2$. This is
not our functional model because $X_2$ is not a vector of fixed constants;
nevertheless Theorem 1 is true in this instance. Denoting the estimates of $\alpha$
by $\hat\alpha_L$, $\hat\alpha_M$, $\hat\alpha_R$ and $\hat\alpha_{MR}$ for the LSE and (2.3) - (2.5) respectively, one can use
Theorem 1 to prove that $\hat\alpha_L$ is always at least as good asymptotically as the
others. If by an appropriately standardized version of $\hat\alpha$ we mean $N^{1/2} (\hat\alpha - \alpha)$,
then


Theorem 5 Appropriately standardized, in either Case No. 1 or Case No. 3, the
LSE always has smaller asymptotic variance than $\hat\alpha_M$, $\hat\alpha_R$ or $\hat\alpha_{MR}$.

Theorem 5 seems to imply that in such randomized studies, one is better off
not using EIV techniques. Also, in terms of inference, detailed calculations
enable one to prove

Theorem 6 Consider Case No. 1 (this result holds for Case No. 3 as well).
$N^{1/2} (\hat\alpha_L - \alpha)$ is asymptotically normal with mean zero and variance $\sigma_L^2$. Let
$\hat\sigma_L^2$ be the usual estimate of the variance of $N^{1/2} (\hat\alpha_L - \alpha)$. Then

$$\hat\sigma_L^2 \xrightarrow{p} \sigma_L^2.$$

Thus, while errors-in-variables make the LSE less efficient, the inferences
one normally makes using the LSE are asymptotically correct for the treatment
effect in randomized two class ANOCOVA.

We also performed a small Monte-Carlo study of two class ANOCOVA when N=40 and
r=s=2. The random variables for measurement and equation error were generated
as in Section 4, although we only studied the case of no equation error. The
covariables of (5.3) were normally distributed with mean zero and variance 9.
We chose $\beta' = (4 \;\; 4 \;\; 4)$ so that $\alpha = 4$.

In Table 3, for estimating the treatment effect $\alpha$ we report the MSE
efficiencies of the LSE relative to the normality-MLE. The results are
strikingly in accord with Theorem 5.

As seen in Table 3, even when there is no equation error, the LSE is much
better than the normality-MLE for estimating the treatment effect, while the
LSE is much worse for estimating the often less important covariable effect.

The preceding results apply only to balanced analysis of covariance. If the
covariates are not balanced across treatments, the LSE will inconsistently
estimate the treatment effect, with possibly disastrous consequences.
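The contrast between the two coefficients is easy to see in a simulation. The sketch below is a simplified, single-replicate illustration rather than a reproduction of the study: it uses a balanced design with a normally distributed covariate of variance 9 and $\beta' = (4 \;\; 4 \;\; 4)$ as above, but a large sample and normal errors only. The function name `simulate_ancova_lse` and the sample size are hypothetical. With covariate measurement error variance $\sigma_u^2$, the least squares slope is attenuated toward $4 \cdot 9 / (9 + \sigma_u^2)$, while the treatment coefficient remains near 4.

```python
import numpy as np

def simulate_ancova_lse(n, sigma_u2, rng, beta=(4.0, 4.0, 4.0),
                        sigma_z2=9.0, sigma_v2=4.0):
    """Least squares fit of balanced two-class ANOCOVA with a mismeasured covariate.

    Returns the fitted (intercept, treatment effect, covariate slope).
    n must be even so that the two treatment groups are balanced.
    """
    mu, alpha, slope = beta
    treat = np.tile([1.0, -1.0], n // 2)                 # balanced +1 / -1 coding
    z = rng.normal(scale=np.sqrt(sigma_z2), size=n)      # true covariate
    y = mu + alpha * treat + slope * z + rng.normal(scale=np.sqrt(sigma_v2), size=n)
    c = z + rng.normal(scale=np.sqrt(sigma_u2), size=n)  # observed covariate
    design = np.column_stack([np.ones(n), treat, c])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(5)
coef = simulate_ancova_lse(100_000, sigma_u2=4.0, rng=rng)
attenuated_slope = 4.0 * 9.0 / (9.0 + 4.0)
print(coef, attenuated_slope)
```

The treatment effect is recovered because the covariate is independent of treatment assignment; an unbalanced covariate would break this, as noted above.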

6. Proof of Main Result

In proving Theorem 1, we will make use of the following result.

Lemma 1 Let $\{X_i\}$ and $\{Y_i\}$ be two sequences of random variables, each i.i.d.
with zero mean, positive finite variances $\sigma_x^2$ and $\sigma_y^2$, respectively, and
$\mathrm{Cov}(X_i, Y_j) = \delta_{ij} \sigma_{xy}$. Let $\{a_i\}$ be a sequence of constants satisfying

$$n^{-1} \sum_{i=1}^{n} a_i^2 \to a^2, \qquad 0 < a^2 < \infty. \qquad (6.1)$$

Then with

$$S_n^2 = \sigma_x^2 \sum_{i=1}^{n} a_i^2 + 2 \sigma_{xy} \sum_{i=1}^{n} a_i + n \sigma_y^2,$$

$$S_n^{-1} \sum_{i=1}^{n} (a_i X_i + Y_i)$$

converges in distribution to a standard normal random variable.
The proof is straightforward, and is contained in Gallo (1982a).

We now outline the proof of Theorem 1. Again, complete details are provided
by Gallo (1982a).

Proof of Theorem 1 Part (i): Let


[C*  Y]*  [C*  Y] 


'  [Ip2  62]  £o 


•1 


<S1  %  V"1'1  h 


-  S2  [0  6j]‘ 


The following representation is essentially an extension of one obtained by
Gleser (1981):


,1/2 


(W*-  E  (W*))  [6'  -1]'  +  op  (1) 

(6.2) 


as long as

$$(W_* - E(W_*)) [\beta' \;\; -1]' = O_p(N^{1/2}). \qquad (6.3)$$

Thus, finding a limit distribution for $\hat\beta_M$ reduces to finding one for the term
in (6.3). Letting

$$H = [X \;\; X\beta], \qquad G = [U_* \;\; \epsilon]$$

with $H_i$, $G_i$ the ith rows of these matrices, and noting that $H [\beta' \;\; -1]' = 0$,
for all $\gamma$ we have

$$\gamma' (W_* - E(W_*)) [\beta' \;\; -1]' = \sum_{i=1}^{N} \gamma' (H_i' G_i + G_i' G_i - \Sigma_*) [\beta' \;\; -1]'.$$

If $\gamma' \propto [\beta' \;\; -1]$, then each $\gamma' H_i' = 0$ and limiting normality follows trivially
since we have a sum of i.i.d. random variables with finite variance. For all
other $\gamma$,

$$N^{-1} \sum_{i=1}^{N} (\gamma' H_i')^2 = N^{-1} \gamma' H' H \gamma \to \gamma' [I_p \;\; \beta]' \Delta [I_p \;\; \beta] \gamma > 0,$$

and the sequences $\{G_i [\beta' \;\; -1]'\}$ and $\{\gamma' (G_i' G_i - \Sigma_*) [\beta' \;\; -1]'\}$ are each
i.i.d. with zero mean and finite variances. Thus, Lemma 1 applies, and after
some algebra we obtain

$$N^{-1/2} (W_* - E(W_*)) [\beta' \;\; -1]' \to N\big( 0, \; d \, ([I_p \;\; \beta]' \Delta [I_p \;\; \beta] + \Sigma_*) + \Sigma_* [\beta' \;\; -1]' [\beta' \;\; -1] \Sigma_* \big).$$

The result (3.1) follows using (6.2), after some more algebra.


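Part (ii):

$$\hat\beta_R = \Big( \sum_{i \ne j} C_{i*}' C_{j*} \Big)^{-1} \Big( \sum_{i \ne j} C_{i*}' Y_j \Big),$$

$$N^{1/2} (\hat\beta_R - \beta) = \Big( N^{-1} \sum_{i \ne j} C_{i*}' C_{j*} \Big)^{-1} N^{-1/2} \sum_{i \ne j} C_{i*}' (Y_j - C_{j*} \beta)$$

$$= r^{-1} (r-1)^{-1} \Delta^{-1} N^{-1/2} \sum_{i \ne j} C_{i*}' (Y_j - C_{j*} \beta) + o_p(1)$$

as long as

$$\sum_{i \ne j} C_{i*}' (Y_j - C_{j*} \beta) = O_p(N^{1/2}).$$

With $G_j = [U_{j*} \;\; \epsilon_j]$ and $U_{jk*}'$, $X_k'$, $G_{jk}'$ the kth rows of $U_{j*}$, X and $G_j$, we have,
for all $\gamma \in \mathbb{R}^p$,

$$\gamma' \sum_{i \ne j} C_{i*}' (Y_j - C_{j*} \beta) = - \gamma' \sum_{k=1}^{N} \Big( (r-1) X_k \sum_{j=1}^{r} G_{jk}' + \sum_{i \ne j} U_{ik*} G_{jk}' \Big) [\beta' \;\; -1]'.$$

The sequences $\{ \sum_{j=1}^{r} G_{jk}' [\beta' \;\; -1]' \}$ and $\{ \gamma' \sum_{i \ne j} U_{ik*} G_{jk}' [\beta' \;\; -1]' \}$ are each i.i.d.
with zero means and finite variances; also,

$$N^{-1} \sum_{k=1}^{N} ((r-1) \gamma' X_k)^2 = N^{-1} (r-1)^2 \gamma' X' X \gamma \to (r-1)^2 \gamma' \Delta \gamma;$$

thus Lemma 1 applies as before, and the result (3.2) follows.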


Part (iii): This is similar to part (i); with H and $G_j$ as in part (ii), let

$$T_{1*} = r^{-1} (r-1) \sum_{i=1}^{r} G_i' G_i - r^{-1} \sum_{i \ne j} G_i' G_j,$$

$$T_{2*} = \sum_{i=1}^{r} (H + G_i)' (H + G_i).$$

Analogously to (6.2), we can obtain the following: if

$$(T_{2*} - r (r-1)^{-1} T_{1*}) [\beta' \;\; -1]' = O_p(N^{1/2}),$$

then

$$N^{1/2} (\hat\beta_{MR} - \beta) = r^{-1} \Delta^{-1} \begin{bmatrix} I_{p_1} & 0 \\ S_3 & S_2 \end{bmatrix} N^{-1/2} (T_{2*} - r (r-1)^{-1} T_{1*}) [\beta' \;\; -1]' + o_p(1).$$

For all $\gamma$ of conformable dimension,

$$\gamma' (T_{2*} - r (r-1)^{-1} T_{1*}) [\beta' \;\; -1]' = \gamma' \Big\{ \sum_{i=1}^{r} H' G_i + (r-1)^{-1} \sum_{i \ne j} G_i' G_j \Big\} [\beta' \;\; -1]'.$$

As before, the sequences

$$\Big\{ \sum_{i=1}^{r} G_{ik}' [\beta' \;\; -1]' \Big\} \quad \text{and} \quad \Big\{ \gamma' \sum_{i \ne j} G_{ik} G_{jk}' [\beta' \;\; -1]' \Big\}$$

are i.i.d., each with zero mean, and the sequence $\{\gamma' H_k'\}$ is just as in part
(i), so the result again follows from Lemma 1.

7. Other Details

In this section we supply the details of the proofs of Theorems 2 and 4.


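Proof of Theorem 2 With $\sigma_\delta^2 = 0$, the limiting covariance matrix of
$\hat\beta_R$ in (3.2) becomes

$$\Delta^{-1} \big( r^{-1} d \, \Delta + r^{-1} (r-1)^{-1} (d \, \Sigma_{u*} + D_R) \big) \Delta^{-1}.$$

Comparing this with (3.3), since $D_R$ is positive semidefinite, it will suffice
to show that for all $\gamma \in \mathbb{R}^{p_2}$,

$$\gamma' \big( [I_{p_2} \;\; \beta_2] \, \Sigma^{-1} \, [I_{p_2} \;\; \beta_2]' \big)^{-1} \gamma \le \gamma' \Sigma_u \gamma. \qquad (7.1)$$

Now since $\Sigma^{-1}$ can be expressed as

$$\Sigma^{-1} = \begin{bmatrix} \Sigma_u^{-1} & 0 \\ 0 & 0 \end{bmatrix} + (\sigma_\epsilon^2 - \Sigma_{\epsilon u}' \Sigma_u^{-1} \Sigma_{\epsilon u})^{-1} \, [\Sigma_{\epsilon u}' \Sigma_u^{-1} \;\; -1]' \, [\Sigma_{\epsilon u}' \Sigma_u^{-1} \;\; -1],$$

we have

$$\gamma' [I_{p_2} \;\; \beta_2] \, \Sigma^{-1} \, [I_{p_2} \;\; \beta_2]' \gamma \ge \gamma' \Sigma_u^{-1} \gamma. \qquad (7.2)$$

Equation (7.1) now follows immediately from (7.2) and Graybill (1969), Theorem
12.2.14 (5).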


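Proof of Theorem 4 Recall that

$$\hat\beta_L = (C_*' C_*)^{-1} (C_*' Y).$$

It is easily shown that

$$N^{-1} (C_*' C_*) \xrightarrow{p} \Delta + \Sigma_{u*}, \qquad N^{-1} C_*' Y \xrightarrow{p} \Delta \beta + \Sigma_{\epsilon u*},$$

so

$$\hat\beta_L \xrightarrow{p} (\Delta + \Sigma_{u*})^{-1} (\Delta \beta + \Sigma_{\epsilon u*}) = \beta_L, \ \text{say}.$$

Let

$$\hat\beta_L' = [\hat\beta_{1L}' \;\; \hat\beta_{2L}'], \qquad \beta_L' = [\beta_{1L}' \;\; \beta_{2L}'];$$

we have

$$\hat\beta_{iL} \xrightarrow{p} \beta_{iL}, \qquad i = 1, 2. \qquad (7.3)$$

Thus for all $\gamma \in \mathbb{R}^p$,

$$\gamma' \hat\beta_L = \gamma_1' \hat\beta_{1L} + \gamma_2' \hat\beta_{2L}$$

$$= \gamma_1' \beta_1 + \gamma_2' \beta_2 + (\gamma_1' (X_1' X_1)^{-1} X_1' X_2 - \gamma_2') (\beta_2 - \hat\beta_{2L}) + \gamma_1' (X_1' X_1)^{-1} X_1' (\epsilon - U \hat\beta_{2L}),$$

so if $\gamma$ satisfies (5.2), using (7.3) we obtain

$$N^{1/2} \gamma' (\hat\beta_L - \beta) = N^{1/2} \gamma_1' (X_1' X_1)^{-1} X_1' (\epsilon - U \beta_{2L}) + o_p(1).$$

It follows that $N^{1/2} \gamma' (\hat\beta_L - \beta)$ is asymptotically normal, since the elements of
$\epsilon - U \beta_{2L}$ are i.i.d. with zero mean and finite variance, and the elements of
$\gamma_1' (X_1' X_1)^{-1} X_1'$ satisfy the Noether condition. Also,

$$\mathrm{Var}\big( N^{1/2} \gamma_1' (X_1' X_1)^{-1} X_1' (\epsilon - U \beta_{2L}) \big) = N d_L \, \gamma_1' (X_1' X_1)^{-1} \gamma_1 \to d_L \, \gamma_1' \Delta_{11}^{-1} \gamma_1$$

with $d_L = [\beta_{2L}' \;\; -1] \, \Sigma \, [\beta_{2L}' \;\; -1]'$.
Noting that for $\gamma$ satisfying (5.2)

$$\gamma' \Delta^{-1} \gamma = \gamma_1' \Delta_{11}^{-1} \gamma_1,$$

we obtain

$$N^{1/2} \gamma' (\hat\beta_L - \beta) \to N(0, \; d_L \, \gamma' \Delta^{-1} \gamma).$$

Now for $\gamma$ satisfying (5.2) the limit variance of $\hat\beta_M$ in (3.1) becomes $d \, \gamma' \Delta^{-1} \gamma$,
so we need only show that $d_L \le d$. With

$$M = \Delta_{22} - \Delta_{21} \Delta_{11}^{-1} \Delta_{12},$$

$$q = (M + \Sigma_u)^{-1} [\Sigma_u \;\; \Sigma_{\epsilon u}] [\beta_2' \;\; -1]',$$

we can, after some algebra, show that

$$d - d_L = q' (2M + \Sigma_u) q,$$

which is non-negative for all q since M and $\Sigma_u$ are positive
definite; $d = d_L$ only if q = 0, that is, only if $\Sigma_{\epsilon u} = \Sigma_u \beta_2$,
completing the proof.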
TABLE 3

Efficiencies of the LSE Relative to the Normality-MLE
for the ANOCOVA Experiment

X Measurement                     MSE Efficiency,     MSE Efficiency,
Error Distribution   $\sigma_u^2$    Treatment Effect    Covariable Effect

N                    1            1.02                0.80
N                    2            1.07                0.36
N                    4            1.34                0.15
CN                   1            1.03                0.65
CN                   2            1.16                0.29
CN                   4            1.78                0.21

N = Normal, CN = Contaminated Normal.